ABSTRACT

Oligonucleotides as short as 6nt in length have 
been shown to bind specifically and tightly to 
proteins and affect their biological function. Yet, 
sparse structural data are available for correspond- 
ing complexes. Employing a recently developed 
hexanucleotide array, we identified hexadeoxyribo- 
nucleotides that bind specifically to the 3C protease 
of hepatitis A virus (HAV 3C pro ). Inhibition assays
in vitro identified the hexanucleotide 5'-GGGGGT-3'
(G 5 T) as a 3C pro protease inhibitor. Using 1 H
NMR spectroscopy, G 5 T was found to form a
G-quadruplex, which might be considered as a
minimal aptamer. With the help of 1 H, 15 N-HSQC
experiments, the binding site for G 5 T was located to
the C-terminal β-barrel of HAV 3C pro . Importantly,
the highly conserved KFRDI motif, which has previ- 
ously been identified as putative viral RNA binding 
site, is not part of the G 5 T-binding site, nor 
does G 5 T interfere with the binding of viral RNA. 
Our findings demonstrate that sequence-specific 
nucleic acid-protein interactions occur with oligo- 
nucleotides as small as hexanucleotides and 
suggest that these compounds may be of pharma- 
ceutical relevance. 



INTRODUCTION 

Specific protein-nucleic acid interactions are essential 
for many biological processes. The structural diversity of 
RNA and DNA generates a plethora of structural motifs 
that serve as specific recognition elements, e.g. in gene
regulation, and has led to the development of so-called
aptamer technologies that aim at nucleic acid sequences 
with tailored binding specificities (1-6). Based on these 
discoveries, the idea has been advanced that even 
smaller oligonucleotides may accomplish specific inter- 
actions with protein targets and first examples of such 
interactions were reported, in some cases with DNA oligo- 
nucleotides as small as hexamers (7-10). To further test 
and substantiate this concept, we investigated the ability 
of the viral protease of hepatitis A virus, the 3C protease 
(HAV 3C pro , picornain 3C, EC 3.4.22.28) to bind to DNA 
hexanucleotides. 

HAV belongs to the family Picornaviridae, all members of
which possess a 3C protease. This enzyme plays a
central role in the picornaviral life cycle and
serves a dual purpose: it is the major protease that
processes the viral polyprotein and it binds to regulatory 
structural elements of the 5'-untranslated region of the 
viral RNA, thereby controlling viral genome synthesis 
(11). The 3C pro proteolytic activity has been investigated 
in detail in numerous studies aiming at the development of 
anti-viral drugs. Three-dimensional structures of 3C pro are 
available from X-ray and NMR analyses for a number of 
picornaviruses including HAV (12-14), some complexed 
with substrate peptides or inhibitors (15-18). Binding of 
viral RNA to 3C pro , on the other hand, is much less well 
understood although it is essential for viral genome repli- 
cation. Mutational analyses revealed that a highly 
conserved KFRDI motif in 3C pro is critical for RNA 
binding (19,20), and recent structural studies shed light 
on the details of the 3C pro /RNA interaction at atomic 
resolution (21,22). The binding site for viral RNA with 
the KFRDI motif is located opposite the 3C pro proteolytic 
cleft. For coxsackievirus B3, the 3C pro binding site has 
also been mapped on the surface of the viral RNA (23). 

We set out to determine whether small oligonucleotides
may constitute efficient ligands for nucleic acid binding 3C 
proteases employing HAV 3C pro as an example. To avoid 
cysteine-mediated dimerization of the protein the C24S 
mutant of HAV 3C pro was used. We employed a 
hexanucleotide chip technology (24) to search the DNA 
hexanucleotide sequence space for sequences that bind to 
HAV 3C pro . We then used NMR spectroscopy and bio- 
chemical assays to locate the binding site of one of the 
binding hexanucleotides, G 5 T, and investigated the
conformation adopted by this guanine-rich oligonucleotide. An
NMR assignment of the 25 kDa 3C pro is available in the
literature (25). That study was performed at acidic pH, at
which the enzyme has almost no protease activity. We
therefore reassigned the protein at physiological pH
employing standard triple resonance experiments and
2 H/ 15 N/ 13 C labeling. We used 1 H, 15 N-HSQC spectra to
identify amino acids that were affected by G 5 T.
Surprisingly, G 5 T and other hexanucleotides identified in 
the array did not mimic the viral RNA, nor did they inter- 
fere with the HAV 3C pro -RNA complex in a gel shift 
assay. Instead, we found that the DNA hexanucleotides 
inhibit the proteolytic activity of HAV 3C pro , identifying 
these compounds as a possible starting point for antiviral 
drug development. 

Figure 1. Sequences of array-bound hexanucleotides with strongest
binding to HAV 3C pro . Related species are grouped (I-IV) and their 
major sequence characteristics are depicted on the right hand panel. 
Asterisks indicate sequences present within the HAV 5'-UTR: 
CCGGAG (pos. 24-29) and AGGCTA [pos. 90-95, numbers according 
to (20)]. 



MATERIALS AND METHODS 

Oligonucleotides 

All oligonucleotides were purchased from Biomers (Ulm, 
Germany) at HPLC grade purity. Besides the strands listed
in Figure 1, the following sequences showed no signal
in the hexanucleotide array with 3C pro and were
used as controls: 5'-TAGGAC-3', 5'-GGGTGG-3' and
5'-ACTACA-3'.

Expression and purification of HAV 3C pro 

All experiments were conducted using the C24S mutant 
of HAV 3C pro . The protein was expressed and an initial 
purification step was conducted as described previously 
(27). Protein samples were further purified using
high-resolution cation exchange and size exclusion
chromatography. Approximately 20-30 mg of protein was applied to a 6-ml
Resolve S cation exchange column at a flow rate of
one column volume per minute. The protein was then
eluted using a gradient of 0 to 1 M NaCl in 10 mM
potassium phosphate buffer, pH 7.4. Under these conditions,
the NaCl concentration at which elution of 3C pro occurred
was 116 mM. The main peak was further purified
using a HiPrep 26/60 Sephacryl S-300 HR size exclusion
column at a flow rate of 1 ml/min in 10 mM potassium
phosphate, pH 7.4. Protein samples were checked for
purity and activity using SDS polyacrylamide gel
electrophoresis (SDS-PAGE) and a proteolytic activity assay
(vide infra).

Fluorescence labeling of HAV 3C pro 

Fluorescent labeling of the lysine residues of HAV 3C pro was
performed with the Alexa Fluor 488 Monoclonal
Antibody Labeling Kit (Invitrogen, Carlsbad, CA, USA)
according to the manufacturer's instructions. Briefly, 50 µg
of PBS-buffered HAV 3C pro in a total volume of 90 µl
was supplemented with 0.1 M sodium bicarbonate to pH
~8.3 before adding Alexa Fluor 488 reactive dye. After
60 min of incubation at room temperature, the protein
solution was dialyzed overnight at 4°C against PBS
using a Spectra/Por dialysis membrane (Roth, Karlsruhe,
Germany) with a cut-off of ~3500 Da. Subsequently, the
labeled protein was analyzed by 10% SDS-PAGE
followed by determination of the labeling efficiency on a
PhosphorImager Typhoon 8600 (Amersham Biosciences,
Freiburg, Germany) with a 526-nm short pass filter.

Analysis of HAV 3C pro on a hexamer array 

For microarray analysis, a hexamer array representing the 
complete hexameric sequence space (4096 hexameric 
oligonucleotides) was used (24). The array was blocked 
with 2% (w/v) casein in buffer (50 mM Tris/HCl pH 8.0, 
5 mM KCl, 5 mM MgCl 2 ) for 4 h at room temperature and
washed five times with 25 mM HEPES pH 7.4 before
starting incubation with 600 µl of 1.3 µM Alexa 488
labeled HAV 3C pro for 5 min at room temperature. After
washing five times for 5 s with 25 mM HEPES pH 7.4,
once for 2 s with water and rinsing for 1 s with absolute
ethanol, the array was scanned on a PhosphorImager
Typhoon 8600 with a 526 nm short pass filter. Spot
intensities were quantified with the program ImageQuant.

Proteolytic activity assay 

The proteolytic activity of HAV 3C pro was measured with 
the peptide substrate Ac-ELRTQ-pNA as described 
previously (27). For the ranking of hexanucleotides
according to their inhibitory activity, 6.4 µM 3C pro and
32 µM hexamer (monomeric concentration) in 50 mM
HEPES pH 7.4 (adjusted with 1 M KOH, yielding a
final K + concentration of 25 mM) were incubated
for five minutes at room temperature in a volume of
150 µl before addition of 1.67 mM peptide substrate
Ac-ELRTQ-pNA. The kinetics were measured
on a spectrophotometer (Beckman DU-600,
Fullerton, CA, USA) at 405 nm for 5 min, taking values
every 30 s. The ΔOD/min was calculated in the linear
interval and the cleavage activity of 3C pro in the
presence of hexamer was compared to the activity in the
absence of hexameric oligonucleotides, which was set to
100%. For the comparison of leupeptin and G 5 T, 6.2 µM
3C pro , 32 µM leupeptin and 32 µM G 5 T were used.
To evaluate the dependency of the inhibitory activity of
G 5 T on ionic strength, 6.2 µM 3C pro and 32 µM G 5 T
were used, and the ionic strength was adjusted using
10 mM KCl and 10 mM NaCl for one sample and 50 mM
KCl and 50 mM NaCl for another sample. For the
Lineweaver-Burk plot, 33 µM 3C pro samples without
inhibitor and with G 5 T concentrations of 8.5, 17.8 and 26.7 µM
were used.
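
The evaluation described above (slope of the absorbance trace in its linear interval, normalised to the hexamer-free control) can be sketched as follows. This is only an illustration: the absorbance values are made up, the function names are hypothetical, and an ordinary least-squares slope is assumed as the way ΔOD/min is obtained.

```python
def slope(times_min, od):
    """Least-squares slope of OD versus time, i.e. delta-OD per minute."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_od = sum(od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_od) for t, y in zip(times_min, od))
    return sxy / sxx


def relative_activity(times_min, od_with_hexamer, od_control):
    """Cleavage activity in percent of the hexamer-free control (control = 100%)."""
    return 100.0 * slope(times_min, od_with_hexamer) / slope(times_min, od_control)


# One reading every 30 s over 5 min gives 11 time points.
times = [0.5 * i for i in range(11)]
# Hypothetical, perfectly linear traces (not measured data).
control = [0.050 + 0.020 * t for t in times]
with_hexamer = [0.050 + 0.008 * t for t in times]

activity = relative_activity(times, with_hexamer, control)
inhibition = 100.0 - activity
print(f"residual activity: {activity:.1f}%, inhibition: {inhibition:.1f}%")
```

In practice only the time points inside the linear interval would be passed to `slope`.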



HAV RNA samples for gel shift analysis 

The plasmid pHAV/7 harboring the sequence of the 
attenuated HAV strain HM175 was used as a template 
of RNA transcripts (41,42). After linearization of 
pHAV/7 with SspI, transcription in vitro was performed
in a 50 µl solution containing 600 ng template DNA and
200 U of SP6 RNA polymerase in the presence of 50 µCi
[α- 32 P] CTP, 100 nmol of NTPs and 1 U of RNase
inhibitor for 2 h at 37°C, followed by treatment with 10 U of
DNase I for 20 min at 37°C. Subsequently, the transcript
was purified by phenol/chloroform extraction and
gel filtration (G50 column). The quality of the transcripts
was checked on a 6% native polyacrylamide gel.



Gel shift assay with radiolabeled HAV RNA 

RNA-protein binding reactions were performed as 
described elsewhere (45). Briefly, a 15 µl reaction mixture
containing 0.4 nM 32 P-labeled HAV 5'-UTR RNA
(154 nt), 25.6 µM 3C pro and 50 µM hexamer was incubated
in the presence of 20 U of RNase inhibitor (RiboLock,
Fermentas) in binding buffer [5 mM HEPES pH 7.9,
25 mM KCl, 2 mM MgCl 2 , 1.75 mM ATP, 6 mM DTT,
0.05 mM PMSF, 0.167 mg/ml tRNA, 5% (v/v) glycerol]
for 20 min at 37°C. The reaction mixture was
supplemented with 5 µl of loading buffer [50% (v/v) glycerol,
1 mM EDTA, 0.25% bromphenol blue] and analyzed
on a 6% native polyacrylamide gel that had been
prerun for 45 min at 4°C and 12 V/cm. Electrophoresis
was conducted at 17 V/cm at 4°C until the bromphenol
blue marker had migrated to a position of 2/3 of
the gel length. The gels were dried and subjected to
autoradiography.



Gel assay analysis of G-quartet structures 

The analysis of G-quartet structures of hexameric 
species was performed as described previously (28). 
Briefly, 500 pmol of hexanucleotide was incubated in the
presence or absence of 10 mM KCl for 30 min at 95°C
and 60 min at 4°C before adding an equal amount of
25% (v/v) Ficoll 400 in TBE buffer. Samples were
loaded on a 20% native polyacrylamide gel that had
been prerun for 30 min at 12 V/cm and 4°C.
Electrophoresis was carried out at 4°C with 17 V/cm for
about 1.5 h. The nucleic acids were detected
with Stains All (Sigma-Aldrich, Deisenhofen,
Germany) according to the manufacturer's protocol.

Assay for HAV-driven gene expression 

Fifteen thousand Huh-T7 cells were seeded in 96-well 
plates and cultured overnight in Dulbecco's modified Eagle
medium (DMEM) supplemented with 10% (v/v) fetal calf
serum, 2 mM glutamine, 100 U/ml penicillin and 100 µg/ml
streptomycin. Then, a 50 µl mixture of 50 ng of a
replication-competent pT7-18f-Luc containing HAV
sequence in which the P1 domain had been replaced by
firefly luciferase sequences (42), 0.1 ng of phRL-SV40
coding for Renilla luciferase serving as transfection
control as well as 5 µM hexamer and 10 µl
Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA)
was added to the cell monolayer under serum-free
conditions in Opti-MEM (Invitrogen) and transfection was
continued for 4 h. The medium was then replaced by
serum-containing DMEM. As a negative control, a
transfection of 1 ng firefly luciferase coding plasmid (pGL3),
0.1 ng phRL-SV40 and 5 µM hexamer was incubated in
parallel. After 72 h, the cells were washed with 100 µl
PBS and lysed with 20 µl passive lysis buffer (Promega,
Madison, WI, USA). For determination of luciferase
activity, samples were measured with the dual luciferase
assay (Promega) on an Anthos Lucy 3 luminometer
(Mikrosysteme GmbH, Krefeld, Germany).

Sample preparation for NMR resonance assignment 

NMR samples for backbone assignment via triple
resonance experiments contained 0.2 mM 2 H, 13 C, 15 N labeled
3C pro in a total volume of 200 µl of 200 mM deutero-Tris pH
7.4, 10 mM deutero-DTT and 10% D 2 O. The percentage
of incorporated deuterium was roughly 75% for the Hα
protons. TROSY variants of the HNCO, HN(CA)CO,
HNCACB and HN(CO)CACB experiments were
acquired at 30°C on a Bruker Avance 700 MHz
spectrometer fitted with a TXI z-gradient cryoprobe. For the
acquisition of paramagnetic relaxation enhancement
(PRE) effects, a 0.2 mM sample of 15 N labeled 3C pro
was labeled with the spin label
S-(2,2,5,5-tetramethyl-2,5-dihydro-1H-pyrrol-3-yl)methyl methanesulfonothioate
(MTSL) following the method described by Battiste and
Wagner (46), with the modification that unreacted 3C pro
was not removed and a PD-10 column was used for the
removal of unreacted MTSL. The spectra used for the
analysis of PRE
effects were acquired at 37°C on a Bruker DRX 500 MHz
spectrometer equipped with a TXI z-gradient probehead,
and the sample was reduced in situ with a 3-fold excess of
ascorbic acid prior to repeating the measurement. All
spectra were processed using NMRPipe (47) with linear
prediction in the indirect dimensions, and signal intensities
were determined through the use of the nlinLS module of
NMRPipe. Sparky (T. D. Goddard and D. G. Kneller,
SPARKY 3, University of California, San Francisco)
was used for the backbone assignment.

DNA oligonucleotide samples for 1 H NMR

About 3.4 mg of G 5 T (Biopolymers) were dissolved in
108 µl of D 2 O to give a 16.7 mM stock solution and
stored at -20°C. For 1D 1 H NMR measurements, 5 µl
of this stock was diluted with 195 µl of 50 mM potassium
phosphate pH 7.5, 100 mM sodium chloride and 10%
D 2 O to give a DNA concentration of 417 µM. The pH
was adjusted to 7.5 using 1 M NaOH and the sample
was transferred to a 3 mm (OD) NMR tube (Bruker
Biospin Match system). Six 1D spectra were collected
between 5 and 30°C using 5°C steps to observe sharpening
of amino peaks.
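
The concentrations quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only; the helper names are hypothetical, and the molar mass it returns is merely the value implied by the stated mass, volume and stock concentration, not a figure given in the text.

```python
def diluted_concentration_mM(stock_mM, stock_volume_ul, diluent_volume_ul):
    """Concentration after mixing a stock aliquot with diluent (c1*v1 = c2*v2)."""
    total_volume_ul = stock_volume_ul + diluent_volume_ul
    return stock_mM * stock_volume_ul / total_volume_ul


def implied_molar_mass(mass_mg, volume_ul, concentration_mM):
    """Molar mass (g/mol) implied by dissolving mass_mg in volume_ul at concentration_mM."""
    moles = (concentration_mM * 1e-3) * (volume_ul * 1e-6)  # mol/L times L
    return (mass_mg * 1e-3) / moles


# 5 µl of the 16.7 mM stock plus 195 µl of buffer
nmr_sample_uM = 1000.0 * diluted_concentration_mM(16.7, 5.0, 195.0)
# 3.4 mg in 108 µl at 16.7 mM (monomeric strand)
strand_mass = implied_molar_mass(3.4, 108.0, 16.7)

print(f"NMR sample: {nmr_sample_uM:.1f} µM strand")
print(f"implied molar mass: {strand_mass:.0f} g/mol")
```

The first value reproduces the 417 µM stated above; the second is in the range expected for a DNA hexamer.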

15 N-HAV 3C pro for binding study 

A sample containing 100 µM uniformly 15 N labeled 3C pro
in 50 mM potassium phosphate, 150 mM sodium chloride,
2 mM deutero-DTT, 0.25 mM deutero-EDTA, pH 7.5 and
10% D 2 O was prepared and transferred to a 3 mm (OD)
NMR tube (see above). For the 1 H, 15 N-HSQC titration,
G 5 T was added to yield 3C pro :G 5 T ratios of 1:4, 1:20 and
1:40 (monomeric G 5 T concentrations). The pH of each
NMR sample was checked and, if necessary, adjusted
prior to spectra collection. Spectra were collected at
30°C on a 500 MHz Bruker DRX spectrometer equipped
with a cryogenic probe.

Diffusion time measurements 

Diffusion ordered spectroscopy (DOSY) spectra for 3C pro ,
G 5 T and the 3C pro :G 5 T complex were acquired at 37°C
on the above-mentioned 500 MHz NMR spectrometer
using an STE sequence with bipolar gradients and
WATERGATE solvent suppression. For each sample,
1D 1 H NMR spectra with 1024 scans were recorded at
10 different gradient strengths ranging from 0.96 to
45.7 G/cm while the diffusion time (Δ) and gradient
length (δ/2) were kept constant. The gradient length was
1.8 ms for each sample and the diffusion time was varied
between 100 and 170 ms for different diffusion coefficients
to sample the whole decay curve for every sample.
The relative signal intensities were plotted as functions
of the gradient strength and the curves were fitted
according to the equation:

$$I = I_0 \exp\left[-D \gamma^2 g^2 \delta^2 \left(\Delta - \delta/3 - \tau/2\right)\right]$$

where I is the signal intensity, D the translational diffusion
coefficient, γ the proton gyromagnetic ratio, g the
gradient strength, δ the gradient length, Δ the diffusion
time and τ the time between gradients of a gradient pair
(218 µs in our measurements). For samples containing
3C pro , the two most upfield methyl resonances were
chosen for the analysis because they did not overlap
with other signals. For free G 5 T, several aromatic
signals were chosen and these displayed very little
variation. A 2 mM sample of hen egg lysozyme was
first used to verify correct z-gradient calibration, which
is critical for accurate DOSY measurements. At 20°C,
a translational diffusion coefficient of (11.2 ± 0.1) ×
10^-7 cm 2 /s was obtained for this sample, which is in
good agreement with published values (48).
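
As an illustration of how a diffusion coefficient can be extracted from such a decay curve, the sketch below fits synthetic intensities to the attenuation equation given above. Several points are assumptions rather than details from the text: the fit is done by linear regression of ln I against the gradient-dependent factor (the original fitting procedure is not specified), δ is set to 1.8 ms, the gradient strengths are spaced evenly, and the intensities are simulated, not measured.

```python
import math

# Proton gyromagnetic ratio in rad s^-1 G^-1, so that g can be given in G/cm.
GAMMA_H = 2.6752e4


def attenuation_factor(g, delta, big_delta, tau):
    """gamma^2 g^2 delta^2 (Delta - delta/3 - tau/2), in s/cm^2."""
    return (GAMMA_H * g * delta) ** 2 * (big_delta - delta / 3.0 - tau / 2.0)


def fit_diffusion_coefficient(gradients, intensities, delta, big_delta, tau):
    """Return D (cm^2/s) from a straight-line fit of ln(I) versus the attenuation factor."""
    xs = [attenuation_factor(g, delta, big_delta, tau) for g in gradients]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # ln I = ln I0 - D * x, so the slope is -D


# Ten gradient strengths between 0.96 and 45.7 G/cm (even spacing is an assumption).
gradients = [0.96 + i * (45.7 - 0.96) / 9 for i in range(10)]
delta, big_delta, tau = 1.8e-3, 0.100, 218e-6  # seconds

# Synthetic, noise-free decay for a hypothetical D of 1.12e-6 cm^2/s.
d_true = 1.12e-6
intensities = [
    math.exp(-d_true * attenuation_factor(g, delta, big_delta, tau)) for g in gradients
]

d_fit = fit_diffusion_coefficient(gradients, intensities, delta, big_delta, tau)
print(f"fitted D = {d_fit:.3e} cm^2/s")
```

With noisy experimental data, a nonlinear fit of the intensities themselves would weight the points differently from this log-linear version.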



RESULTS 

Screening hexanucleotide-3C pro interactions 

First, we systematically investigated possible interactions 
between HAV 3C pro and DNA hexanucleotides using an 
array that contains the complete hexanucleotide sequence 
space (24). This array is designed such that 
hexanucleotides are attached to the chip surface via a 
non-nucleic acid linker attached to the 3' terminus of the 
DNA. In terms of specificity, interactions between 
proteins and surface-bound hexanucleotides have been 
shown to be compatible with interactions in solution (9). 
In the case of 3C pro , array binding studies revealed a 
number of binding oligonucleotides which were grouped 
according to their primary sequence (Figure 1). The 
majority of these hexanucleotides exhibit high purine 
content. The consensus sequence of group II indicates 
their potential to form G-quadruplexes. Two out of 
three sequences shown in group IV are palindromes and 
could form antiparallel double strands. In the 
hexanucleotide array, formation of higher order structures 
should be sterically possible because the hexanucleotides 
are attached to the chip surface via a 38 atom flexible 
linker (24). 
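
The sequence features discussed in this section (complete sequence space, self-complementary "palindromic" hexamers, and consecutive guanine stretches) are easy to enumerate. The following sketch is illustrative only; the function names are hypothetical, and the threshold of five consecutive guanines is an assumption chosen to match G 5 T-like sequences, not a criterion stated in the text.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def all_hexamers():
    """Every DNA hexanucleotide sequence (4**6 of them)."""
    return ["".join(bases) for bases in product("ACGT", repeat=6)]


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_self_complementary(seq):
    """True if the sequence equals its reverse complement (can pair with itself antiparallel)."""
    return seq == reverse_complement(seq)


def longest_g_run(seq):
    """Length of the longest stretch of consecutive guanines."""
    longest = current = 0
    for base in seq:
        current = current + 1 if base == "G" else 0
        longest = max(longest, current)
    return longest


hexamers = all_hexamers()
palindromes = [h for h in hexamers if is_self_complementary(h)]
g_stretch = [h for h in hexamers if longest_g_run(h) >= 5]

print(len(hexamers), len(palindromes), g_stretch)
```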

G 5 T does not interfere with viral RNA binding 

To investigate whether the hexanucleotides identified in 
the DNA array bind to the viral RNA binding site of 
3C pro , we studied binding of the in vitro transcribed 
5'-terminal 154nt of the 5'-UTR of HAV (26) to 3C pro 
in the presence of hexanucleotides (Figure 2). No competi- 
tive effects were observed in this study, i.e. addition of 
hexanucleotides did not cause release of the RNA tran- 
script from the complex with 3C pro . In the presence of 
G 5 T (group II), the experiment resulted in a super-shift 
of the complex formed between HAV RNA and 3C pro 
indicating a ternary complex formed by RNA, 3C pro 
and G 5 T. Furthermore, no influence of G 5 T on the con- 
centration dependency of HAV RNA/3C pro binding was 
observed, and in the absence of 3C pro no complex was 
formed between HAV RNA and G 5 T (data not shown). 
In summary, these data suggest binding of G 5 T to 3C pro at 
a site that is distinct from the putative binding site for viral 
RNA (the KFRDI motif) and does not interfere with viral 
RNA binding. None of the hexanucleotides representative
of groups I, III and IV (CCGGAG, AGGCTA, CGGCGA)
interfered with viral RNA binding and no super-shift
was detected for these hexanucleotides.

Figure 2. Supershift of the HAV 5'-UTR and 3C pro in the presence of
G 5 T. An in vitro transcribed HAV 5'-UTR (154 nt) was shifted in the
presence of 3C pro (lanes #1 versus #2). This signal is further shifted in
the presence of G 5 T (lane #4) but not in the presence of other hits from
the hexanucleotide array (lanes #3, #5 and #6) or control
hexanucleotides (lanes #7 and #8), indicating a higher molecular
weight complex, presumably a ternary complex formed by 3C pro , G 5 T
and the viral 5'-UTR RNA.



The protease activity of 3C pro is decreased in 
the presence of G 5 T 

Finally, the interaction of G 5 T and other hexanucleotides 
with 3C pro was further investigated in functional terms by 
testing the protease activity in vitro in the presence of these 
oligonucleotide ligands (27). This experiment showed 
decreased cleavage activity in the presence of G 5 T, and 
to a smaller extent in the presence of G5A, but almost 
no decrease for the closely related sequences TG 5 
and AG 5 or any of the hexanucleotides representing the 
different groups listed in Figure 1 (Figure 3). Further, 
the deoxyribose backbone of G 5 T seems to contribute
to the inhibition of 3C pro , as indicated by the inhibition
data of chemically modified and extended derivatives
(Figure 3).

Among all oligonucleotides tested, G 5 T exhibited 
the highest inhibitory potency and underwent specific 
binding to 3C pro at a site distinct from the putative 
RNA binding site. Therefore, this hexanucleotide was 
further investigated from a structural point of view. 
Figure 4 shows a comparison of G 5 T's inhibitory
activity with the activity of leupeptin, an established
reversible inhibitor of cysteine proteases. In a low salt
buffer, G 5 T achieved an inhibitory effect of 60% at a
molar concentration at which leupeptin reduced
the protease activity by only about 10%. If the quadruplex
structure of G 5 T is taken into consideration (see below),
its inhibitory activity is about 24 times higher than the
inhibitory activity of leupeptin. However, when repeating
the activity test with different salt concentrations we
found a strong salt dependency of the G 5 T inhibitory
capacity (Figure 4). Addition of 10 mM NaCl and
10 mM KCl reduced the G 5 T inhibitory activity from
60% to 47%, and after addition of 50 mM NaCl and 50 mM
KCl only 28% inhibitory activity was observed for G 5 T.
Because the protease activity of 3C pro itself is not altered
within this salt concentration range, we conclude that the





Figure 3. Inhibition of HAV 3C protease by unmodified (top) and 
modified (bottom) hexameric oligonucleotides. Hexamer sequences are 
indicated from 5' to 3' position. Oligonucleotides were used in a 5-fold 
molar excess over protein. The activity of the enzyme in the absence of 
hexamer was set to 100%. Data were compared using an unpaired t test
and the program Prism 4 (GraphPad Software). Indicated are mean
values ± SEM; ***P < 0.001. The buffer used was 50 mM HEPES at
pH 7.4 (adjusted with 1 M KOH, yielding a final K + concentration of 
25 mM). 



complex between G 5 T and 3C pro must be mediated by 
ionic interactions to a substantial degree. 
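
One way to arrive at the roughly 24-fold figure quoted above is to scale the ratio of the observed inhibitions by the number of strands per active species, since a tetramolecular quadruplex is present at one quarter of the monomeric strand concentration. The sketch below simply reproduces that arithmetic; the function name is hypothetical, and treating the inhibition ratio as a measure of potency is a simplification rather than the authors' stated method.

```python
def fold_potency(inhibition_a, inhibition_b, strands_per_species_a=1, strands_per_species_b=1):
    """Ratio of inhibitions at equal strand concentration, corrected for oligomeric state."""
    inhibition_ratio = inhibition_a / inhibition_b
    concentration_correction = strands_per_species_a / strands_per_species_b
    return inhibition_ratio * concentration_correction


# G5T: 60% inhibition, four strands per quadruplex; leupeptin: about 10%, monomeric.
g5t_vs_leupeptin = fold_potency(60.0, 10.0, strands_per_species_a=4, strands_per_species_b=1)
print(g5t_vs_leupeptin)
```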

In an attempt to classify the inhibitory mechanism by 
which G5T inhibits the 3C pro protease activity, we 
measured the inhibitory activity at different G 5 T concen- 
trations and analyzed the data using a Lineweaver-Burk 
plot (Supplementary Figure S1). Unfortunately, these data
do not allow unambiguous conclusions on the mechanism 
of inhibition. 
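
For readers unfamiliar with the analysis, the sketch below shows how a Lineweaver-Burk (double-reciprocal) fit yields apparent Km and Vmax values, and how these would differ for a purely competitive inhibitor. All numbers are hypothetical and generated from the Michaelis-Menten equation; they are not the data of Supplementary Figure S1, and the function names are illustrative.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, rates):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax and return (Km, Vmax)."""
    slope, intercept = linear_fit([1.0 / s for s in substrate], [1.0 / v for v in rates])
    vmax = 1.0 / intercept
    return slope * vmax, vmax


def michaelis_menten(s, km, vmax):
    return vmax * s / (km + s)


# Hypothetical substrate concentrations (mM) and noise-free rates.
substrate = [0.25, 0.5, 1.0, 2.0, 4.0]
rates_free = [michaelis_menten(s, 1.0, 10.0) for s in substrate]
# A competitive inhibitor raises the apparent Km and leaves Vmax unchanged.
rates_inhibited = [michaelis_menten(s, 2.0, 10.0) for s in substrate]

km_free, vmax_free = lineweaver_burk(substrate, rates_free)
km_inh, vmax_inh = lineweaver_burk(substrate, rates_inhibited)
print(km_free, vmax_free, km_inh, vmax_inh)
```

Lines that intersect on the 1/v axis indicate competitive inhibition, whereas a shared intercept on the 1/[S] axis indicates noncompetitive inhibition; with scattered data the two cases can be hard to tell apart.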

Higher order structure of G 5 T — quadruplex formation 

For more detailed binding studies, we first focused on 
the solution structure of the ligands G 5 T and G 5 A. 
Because of the high guanosine content of group II 
sequences (Figure 1), the ability of G 5 T and G 5 A to 
form higher order structures such as G-quadruplexes 

Figure 4. Comparison of inhibitory activities of leupeptin and G 5 T and effect of ionic strength on the inhibitory activity of G 5 T. Black triangles:

was studied. We initially noticed the ability of these
oligonucleotides to oligomerize on a routine denaturing 
gel employed to check the purity of the hexanucleotides 
in the presence of 8 M urea (data not shown). 
Oligomerization was subsequently verified by NMR spec- 
troscopy, which revealed characteristics typically observed 
for G-quadruplexes. The K + -dependency of quadruplex 
formation was tested in a gel assay following an estab- 
lished protocol (28), and it was found that both G 5 T 
and G 5 A form higher order structures, presumably 
quadruplexes, in the presence of K + , as indicated by 
slow migration (complex A in Figure 5A) that is charac- 
teristic for high molecular weight complexes and that was 
much less pronounced for TG5 and AG5 (complex B in 
Figure 5A). The control hexanucleotide G 3 TG 2 , in which
the consecutive G stretch is interrupted, migrated faster
in native polyacrylamide gels in the presence of K + than
G 5 T/G 5 A, which is consistent with a monomeric form
(Figure 5A).

Additional evidence for the existence of a higher order 
structure of G 5 T was obtained from 1D 1 H NMR spectra
of the oligonucleotide obtained at different temperatures
in the presence of 50 mM K + or Na + . When compared to
monomeric control hexanucleotides (CCGGAG and
AGGCTA), spectra of G 5 T exhibited significant peak
broadening, stronger spectral overlap, and a distinctive set of
resonances around 11 ppm (Figure 5B). Similar signals
had been identified as imino resonances in previous
studies of guanine-rich DNA sequences. These signals
are characteristic for guanine quartets where extensive
Hoogsteen hydrogen bonding protects guanine imino
and amino protons from exchange with water protons
(29,30). 1D 1 H NMR spectra were recorded between
6°C and 65°C for samples containing either
100 mM Na + or 100 mM K + to monitor the stability of
the higher order structure of G 5 T. No significant
difference was observed between the two samples or the
different temperatures (data not shown) except for a set of
broad signals around 9 ppm, which became more intense 
at lower temperatures. The latter set of resonances is char- 
acteristic for amino protons that suffer line broadening as 
a consequence of exchange with water. 

Although overlapping and thus difficult to quantify, the 
number of imino resonances in the 1D 1 H NMR spectra
appeared to be larger than five (Figure 5B middle
spectrum). In a parallel G-quadruplex, chemically
equivalent nucleotides at identical positions in the four different
strands would also be magnetically equivalent and
consequently only five imino resonances would be expected in
total (i.e. there would be only one imino peak for all G2
residues, one for all G3 residues and so on) (29). Thus, a
more complicated scenario seems to be the case for G 5 T.
2D 1 H, 1 H-TOCSY and NOESY spectra were recorded at
6 and 30°C in the presence of 100 mM K + to gain further
insight into the preferred quadruplex conformation.
Strong spectral overlap and multiple conformations 
complicated the analysis of 2D spectra but one prevalent 
DNA conformation could nevertheless be identified. For a 
tentative assignment of G 5 T, the T6 methyl group was 
used as a starting point, readily allowing the assignment 
of T6 and G5. G4, G3 and G2 were assigned based on 
internal and sequential H8-H1' cross peaks (Figure 6A). 
The G2-H8 resonance, however, exhibited cross peaks
with two sets of resonances in the H1' region (in
addition to its internal H8-H1' cross peak) as well as with
further deoxyribose resonances. Because neither of the two
H1' resonances thus associated with G2 exhibited further
sequential connectivities, it was assumed that they belong
to different conformations of G1 (subsequently termed G1
and G1*). The observation of two distinct cross peaks

Figure 5. Formation of higher order structures/quadruplexes. (A) Hexanucleotides containing five consecutive guanosines at their 3'-terminus (AG 5
and TG 5 ; lanes #1-4) show a mobility shift in native polyacrylamide gel electrophoresis in the presence of 10 mM K + (complex B) which is even more
dramatic for hexanucleotides with five guanosines at their 5'-terminus (G 5 A and G 5 T, lanes #5-8) in the presence of 10 mM K + (complex A). G 3 TG 2
serves as a negative control for five consecutive Gs (lanes #9 and 10). (B) 500 MHz 1 H NMR spectra of G 5 T in the absence and presence of 50 mM
K + (top and middle, respectively) and, for comparison, a monomeric ssDNA hexanucleotide (CCGGAG, bottom). The resonances between 10 and
11 ppm (middle spectrum) are indicative of imino groups protected by hydrogen bonding. Spectra were recorded at 30°C, pH 7.5.



between G2-H8 and H1' of G1 and G1*, in
conjunction with an exceedingly weak cross peak between
G1 and G1* resonances, indicates slow exchange between
two quadruplex forms containing either G1 or G1*.

Starting from the H8 resonances, assignment of imino
peaks of the two quadruplex forms was straightforward
(Figure 6B). The most prominent diagonal imino peak at
10.78 ppm (asterisks in Figure 6B) contained the G5-H1
and G4-H1 diagonal peaks as well as the G1*-H1 diagonal
peak. The remaining guanine H1 peaks were sufficiently
resolved to observe cross peaks between G1-H1 and
G2-H1 as well as between G2-H1 and G3-H1. The
additional cross peak between G3-H1 and the G5, G4 and
G1* overlapping H1 resonances could arise from the
sequential connectivity between G3-H1 and G4-H1. No
cross peak between the T6-H6 and any imino resonance
was observed, suggesting that the thymine 3'-end residue is
not involved in the hydrogen bond network.



The assigned imino-imino cross peaks are all consistent 
with a parallel stranded quadruplex structure in which 
only sequential H1-H1 cross peaks would be expected.
In this interpretation, the cross peak between G2-H1
and the G5, G4 and G1* overlapping H1 resonances
would be a G2-G1* imino-imino cross peak. However,
it is also possible that this peak is in fact a G2-G4
imino-imino cross peak whose existence would point to
an antiparallel quadruplex structure in which G2 of one
strand would be hydrogen bonded to G4 of another
strand. In such a scenario, G1-G5 imino-imino cross
peaks would also be expected. While there is clearly no
such cross peak between G5 and G1, nothing can be said
about G5 and G1* because a cross peak between these
imino resonances would be too close to the diagonal
to be resolved. Because no cross peaks between G4 and
G2 or G5 and G1 are observed in the aromatic region
(Figure 6A), we consider an antiparallel strand orientation

Figure 6. (A) Assignment of G 5 T (aromatic and anomeric regions). G 5 T 1 H, 1 H-NOESY spectrum (30°C, 250 ms mixing time, 100 mM K + , 10%
D 2 O). Left: Aromatic region. For clarity, the assignments are abbreviated (3G stands for G3-H8 diagonal peak, 3G4G for cross peak between G3-H8
and G4-H8). Cross peaks between aromatic protons and both their 5'- and 3'-neighboring aromatic protons are observed. G2-H8 exhibits weak cross
peaks with both G1 and G1* (see text). Right: Anomeric region showing cross peaks with G5-H8 and T6-H6. For every guanine anomeric proton,
cross peaks with the aromatic proton belonging to the same nucleotide and with the 3'-neighboring aromatic proton are observed. Assignments are
again abbreviated (5G stands for the G5-H1'/G5-H8 cross peak, 5G6T for the cross peak between G5-H1' and T6-H6). Unassigned cross peaks
belong to a second, unidentified conformation. (B) Assignment of G 5 T (imino region). G 5 T 1 H, 1 H-NOESY spectrum (30°C, 250 ms mixing time,
100 mM K + , 10% D 2 O), imino region. The peak arising from an overlap of the G5, G4 and G1* diagonal peaks is denoted by asterisks. The two
unassigned diagonal peaks belong to the unidentified conformation. While the unassigned resonance at 10.96 ppm has no further cross peaks in the
imino region, the unassigned resonance at 10.70 ppm may form cross peaks with G3-H8 and the G5, G4 and G1* overlapping peak, pointing to a
possible second conformation of one of these residues.



in G 5 T unlikely. However, given the existence of uniden- 
tified peaks and the low spectral dispersion in the imino 
diagonal it appears that more sophisticated NMR experi- 
ments, including those targeted at 15 N and P nuclei, 
would be needed to unambiguously resolve the multiple 
G 5 T conformations. The spectral dispersion in G 5 T for 
both the aromatic and imino peaks is poor, which is in 



accordance with the previous observation that the absence 
of thymine loops in quadruplex structures leads to 
decreased spectral resolution of guanine protons due to 
the absence of thymine ring currents (31). 

At a 10-fold excess of G 5 T over 3C pro , there is no indi- 
cation that the higher order structure is disrupted and 
monomers are formed (Supplementary Figure S2), but 
since under these conditions the 1 H NMR signals reflect
mainly the free form of G 5 T we cannot completely rule out 
that the overall structure of G 5 T is affected upon binding 
to 3C pro . Slight signal broadening and small chemical 
shift changes were observed for the imino resonances 
upon binding to the protein. In agreement with previous 
studies, which identified monovalent cations at the center 
of guanine-quartets (32,33), addition of 10 mM MgCl 2 did
not produce detectable changes in the G 5 T 1D spectrum,
nor did it alter the 1 H, 15 N-HSQC spectrum of the
protein-DNA complex, suggesting that, unlike in other 
nucleic acid-protein complexes, no divalent cations are 
necessary to mediate the HAV 3C pro -G 5 T interaction. 

G 5 T forms quadruplex dimers in solution 

Addition of G 5 T to 3C pro in low salt buffer had a dramatic 
effect on the protein's spectral line width, leading to severe 
signal loss in the 1 H, 15 N-HSQC spectrum (Supplementary
Figure S3). Signals that were observable at a protein: G 5 T 
(monomeric concentration) ratio of 1:4 in the presence of 
20 mM NaCl belonged predominantly to the flexible side 
chains of asparagine and glutamine residues and the pro- 
tein's flexible termini. Comparable spectra are commonly 
observed for proteins of high molecular weight in the 
absence of deuteration and pulse programs optimized 
for large molecular weights. To shed light on the unex- 
pected increase in molecular weight upon 3C pro -G 5 T 
complex formation, DOSY measurements were recorded 
and translational diffusion coefficients extracted for 
free G 5 T and the 3C pro -G 5 T complex (Table 1). 
Surprisingly, the translational diffusion coefficient D
of free G 5 T at 30°C was found to be (12.4 ± 0.2) × 10^-7
cm^2/s for a 1.5 mM sample containing 200 mM KCl,
slightly smaller than the free protein's diffusion coefficient,
which was determined to be (12.8 ± 0.2) × 10^-7 cm^2/s. At a
1:4 protein:DNA ratio (concentration relating to the
monomeric form of G 5 T) and a salt concentration
of 20 mM NaCl, the apparent D for the complex
(analyzed using the protein methyl groups) decreased to
(11.4 ± 0.4) × 10^-7 cm^2/s. The apparent D for the complex
in a sample with the same molar ratios of 3C pro and G 5 T
but 150 mM NaCl was (12.1 ± 0.3) × 10^-7 cm^2/s, and the
respective 1 H, 15 N-HSQC spectrum showed line widths
that were comparable to those observed for the free 
protein. Continuous chemical shift changes for 3C pro 
amide cross peaks were observed with the high-salt 
sample upon titration with G 5 T as well as a decrease in 



Table 1. Translational DOSY diffusion coefficients of 3C pro , G 5 T and
3C pro -G 5 T complexes at 30°C

Sample                               D (10^-7 cm^2/s)
3C pro                               12.8 ± 0.2
G 5 T                                12.4 ± 0.2
3C pro :G 5 T 1:4, 20 mM NaCl        11.4 ± 0.4
3C pro :G 5 T 1:4, 150 mM NaCl       12.1 ± 0.3
3C pro :G 5 T 1:20, 150 mM NaCl      11.6 ± 0.4

Relative G 5 T amounts refer to monomeric concentrations.



the apparent D [(11.6 ± 0.4) × 10^-7 cm^2/s for a 1:20 ratio of
protein:monomeric G 5 T].

These results suggest that the 3C pro -G 5 T complex is in 
fast exchange in the presence of 150mM NaCl and that 
the complex is weakened by the addition of salt. 
Therefore, ionic interactions may be the main driving 
force for the interaction. This observation is in agreement 
with the salt dependency observed in the in vitro inhibition 
of the protease activity. Furthermore, and on the basis 
of previously reported diffusion coefficients for 
G-quadruplexes (31,33), we concluded that G 5 T may in 
fact form a quadruplex dimer in solution (see 'Discussion' 
section). Presumably, the association of two G 5 T 
quadruplexes is promoted by the fact that the 5'-end of 
G 5 T is part of a G-tetrad that is not obstructed by 
preceding 'unstructured' nucleotides. It is thus feasible 
that two parallel G 5 T-quadruplexes align 'head to head' 
with their respective 5'-end orientated toward one another, 
giving rise to extensive aromatic stacking. In such a 
scenario, the additional resonances observed in the 1 H,
1 H-NOESY spectra could originate from NOEs between
the two G 5 T quadruplexes within such a dimer. 
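
The diffusion coefficients in Table 1 can be turned into rough apparent hydrodynamic radii with the Stokes-Einstein relation, r = kT/(6πηD). The following is only a minimal sketch: it assumes spherical particles and the viscosity of pure water at 30°C (0.797 mPa·s), ignoring the salt and the 10% D 2 O present in the actual samples, so the absolute radii are indicative at best. The function name and dictionary are illustrative and not part of the original analysis.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant in J/K
T_30C = 303.15             # 30 °C in kelvin
ETA_WATER_30C = 0.797e-3   # Pa·s; ASSUMPTION: pure H2O, buffer and D2O ignored


def hydrodynamic_radius_nm(d_cm2_per_s, temperature=T_30C, viscosity=ETA_WATER_30C):
    """Stokes-Einstein radius (nm) of an equivalent sphere for a given D (cm^2/s)."""
    d_m2_per_s = d_cm2_per_s * 1e-4  # cm^2/s -> m^2/s
    radius_m = K_B * temperature / (6 * math.pi * viscosity * d_m2_per_s)
    return radius_m * 1e9


# Diffusion coefficients from Table 1 (cm^2/s).
table1 = {
    "3C pro": 12.8e-7,
    "G5T": 12.4e-7,
    "3C pro:G5T 1:4, 20 mM NaCl": 11.4e-7,
    "3C pro:G5T 1:4, 150 mM NaCl": 12.1e-7,
    "3C pro:G5T 1:20, 150 mM NaCl": 11.6e-7,
}

if __name__ == "__main__":
    for sample, d in table1.items():
        print(f"{sample:32s} r_h ~ {hydrodynamic_radius_nm(d):.2f} nm")
```

Because r scales as 1/D, the ratio of two radii is independent of the assumed viscosity, so relative comparisons between the samples are more robust than the absolute values.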

NMR assignment of 3C pro at physiological pH and 1 H,
15 N-HSQC-monitored binding of G 5 T

NMR spectra of HAV 3C pro C24S displayed signs of a 
well-folded protein with good dispersion in the 1 H and
15 N dimensions. DOSY measurements yielded a translational
diffusion coefficient D of (12.8 ± 0.2) × 10^-7 cm^2/s
for a 110 µM sample at 30°C, suggesting that the protein
is predominantly monomeric in the concentration range 
chosen. 

NMR is well-suited for the analysis of weak ligand 
binding to small and intermediate size proteins via 
chemical shift changes and differential line broadening 
caused by direct binding events and allosteric effects 
(34). We conducted a 1 H, 15 N-HSQC-monitored titration
of 15 N-3C pro with G 5 T in the presence of 150 mM NaCl.
1D 1 H and 2D 1 H, 15 N-HSQC spectra were recorded with
0, 4, 20 and 50 equivalents of G 5 T (monomeric concentra- 
tions). Both chemical shift changes and selective broaden- 
ing of backbone NH cross peaks were observed during this 
titration experiment (Figure 7A). An assignment of 3C pro 
backbone resonances has been deposited in the BMRB 
databank prior to this work (25). However, the experi- 
mental conditions used for this assignment were very dif- 
ferent from the conditions used in the present study. 
In particular, the original assignment was performed at 
pH 5.4 where 3C pro exhibits <30% proteolytic activity. 
The maximum activity was observed between pH 8 
and 9, and at pH 7.5 around 80% activity was measured 
(data not shown). Because of the difficulties in transferring 
the original assignment to pH 7.5, a reassignment of 3C pro 
was attempted, using 2 H, 13 C and 15 N-labeled protein and
TROSY variants of two pairs of backbone 3D spectra. 
A sequential assignment strategy was employed as 
far as possible, yielding assignment of 70% of the 
backbone NH resonances (cf. Supplementary Figure S4 
for a representative walk through the protein backbone 
based on HNCACB and HN(CO)CACB spectra). 
Figure 7. (A) G 5 T binding site on 3C pro mapped by NMR. Details of the 1 H, 15 N-HSQC spectra of 3C pro with different concentrations of G 5 T.
Ratios were 1:0 (black), 1:4 (red), 1:20 (green) and 1:40 (blue) equivalents of G 5 T (monomeric G 5 T concentrations). Examples of residues which
undergo chemical shift changes are shown (R111, L113 and R115) as well as unaffected residues from the KFRDI motif (F96 and D98). Resonances
experiencing large chemical shifts are connected by black lines. (B) Chemical shift changes mapped onto 3C pro : Mapping of chemical shift changes
on 3C pro induced by 10 equivalents G 5 T on the HAV 3C pro crystal structure (PDB entry 1QA7). Red: large shift changes, orange: medium shift
changes, turquoise: KFRDI motif, purple: catalytic C172. The changes in the 1 H, 15 N-HSQC spectrum are mostly confined to the C-terminal β-barrel
and preceding helix. Figure rendered using Sybyl Software (Tripos Assoc.).
This strategy was supplemented by the use of an MTSL
spin label and measurements of paramagnetic relaxation
enhancements in conjunction with the HAV 3C pro crystal
structures deposited in the PDB, which allowed another
5% of residues to be unambiguously assigned. The Cα,
Cβ, C, N and H backbone resonances were re-referenced
using CHECKshift (35,36) and deposited in the BMRB 
database under the accession code 16837. 

Chemical shift changes and/or selective peak broaden- 
ing were observed for a set of 3C pro amide NH resonances 
upon addition of G 5 T, indicating their direct or indirect 
involvement in complex formation. Most cross peaks that 
experienced significant chemical shift changes were also 
broadened. The residues with the largest chemical shift
changes, where an assignment at pH 7.5 was available,
were R111, L113, R115 and E139 (Figure 7A and
Supplementary Figure S5). Cross peaks for R115 were
observed only for the first two titration steps and
broadened beyond detection during subsequent steps.
Smaller but significant shift changes were also observed
for A31, K89, I104, N180 and I190. Mapping of these
residues onto the crystal structure of 3C pro (15) revealed
that they are almost all part of the C-terminal β-barrel or
the preceding α-helix (Figure 7B). Only A31 and I190 are
in proximity of the catalytic residue C172, which belongs
to the C-terminal β-barrel that is affected by binding of
G 5 T. Residues of the conserved KFRDI motif, on the 
other hand, did not experience significant chemical shift 
changes or line broadening during titration with G 5 T, as 
can be seen clearly for F96 and D98 (Figure 7A). 
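
Sorting amide cross peaks into large, medium and unaffected shift changes is commonly done with a combined 1 H/15 N chemical shift perturbation. The sketch below illustrates one such scheme. The 15 N weighting factor (0.14), the two thresholds and the example peak values are all assumptions chosen for illustration; the text does not state which formula or cut-offs were actually used, and the peak labels are placeholders rather than measured data.

```python
import math

N_SCALE = 0.14           # ASSUMED weighting of 15N shifts relative to 1H
MEDIUM_THRESHOLD = 0.03  # ppm; hypothetical cut-off for "medium" changes
LARGE_THRESHOLD = 0.06   # ppm; hypothetical cut-off for "large" changes


def combined_shift(delta_h_ppm, delta_n_ppm, n_scale=N_SCALE):
    """Combined 1H/15N chemical shift perturbation in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def classify(delta_ppm, medium=MEDIUM_THRESHOLD, large=LARGE_THRESHOLD):
    """Sort a combined perturbation into one of three classes."""
    if delta_ppm >= large:
        return "large"
    if delta_ppm >= medium:
        return "medium"
    return "unaffected"


# Placeholder (1H, 15N) shift differences in ppm between free and bound protein.
example_peaks = {
    "peak_a": (0.080, 0.40),
    "peak_b": (0.020, 0.20),
    "peak_c": (0.005, 0.02),
}

if __name__ == "__main__":
    for label, (d_h, d_n) in example_peaks.items():
        delta = combined_shift(d_h, d_n)
        print(f"{label}: {delta:.3f} ppm -> {classify(delta)}")
```

Peaks that broaden beyond detection, as described for R115, cannot be scored this way and would need to be flagged separately.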



DISCUSSION 

In this study, we identified a small number of 
hexanucleotides with measurable binding affinity to HAV 
3C pro using a hexanucleotide array. Hexanucleotide 
binding does not compete with binding of viral RNA to 
HAV 3C pro . For one of the hexanucleotides identified, G 5 T, 
a detailed analysis of its interaction with HAV 3C pro was 
performed. 

From gel mobility assays and 1 H NMR spectroscopy, it
became evident that this hexanucleotide forms a higher 
order structure with features indicative of G-quadruplex 
formation. Therefore, we studied this possibility using 
DOSY NMR before entering NMR binding studies.
DOSY experiments yielded a translational diffusion
coefficient of (12.4 ± 0.2) × 10^-7 cm^2/s (at 30°C) for free G 5 T.
This value is surprisingly small when compared to 3C pro
[(12.8 ± 0.2) × 10^-7 cm^2/s], given that the molecular weight
of 3C pro (~24 kDa) is roughly four times the weight
of the G 5 T quadruplex. However, it has to be taken
into consideration that in dilute solutions translational
diffusion reflects hydrodynamic radii rather than molecular
weights. A comparison of the crystal structures
of HAV 3C pro [pdb entries for example 2A40, 1HAV, 
2CXV (15,37,38)] and [TG 4 T] 4 [pdb entry 244D, (32)] 
demonstrates that the size of a parallel stranded 
hexanucleotide G-quadruplex is larger than its molecular 
weight suggests, reaching to roughly half the size of 3C pro . 
This observation, together with literature values for 



similar quadruplexes, leads us to conclude that G 5 T
might in fact form a four-stranded quadruplex-dimer
in solution. The diffusion coefficient for the 32-nt
G-quadruplex [T 2 G 4 T 2 ] 4 was reported to be 13.4 × 10^-7
cm^2/s at 20°C (33). In another report (31), a diffusion
coefficient of 14.2 × 10^-7 cm^2/s was found for the 24-nt
G-quadruplex [G 4 T 4 G 4 ] 2 at 25°C. G 5 T, also forming a
24-nt G-quadruplex ([G 5 T] 4 ), would thus diffuse more slowly
at 30°C than [G 4 T 4 G 4 ] 2 at 25°C although an increase
in temperature generally promotes diffusion. Similarly,
the higher diffusion coefficient for the slightly larger
G-quadruplex [T 2 G 4 T 2 ] 4 at 20°C is not in accordance
with the assumption that G 5 T is a quadruplex [G 5 T] 4 . 
Rather, these comparisons as well as the similarity in the 
diffusion times of G 5 T and 3C pro suggest that two 
G 5 T-quadruplexes [G 5 T] 4 associate in solution to form 
a dimer of quadruplexes, [[G 5 T] 4 ] 2 . The apparent molecu- 
lar weights of G 5 T and 3C pro were found to be 33 and 
22.8 kDa, respectively, in a gel filtration experiment (data 
not shown). While the latter value is in good agreement 
with the calculated molecular weight for 3C pro (23.9 kDa), 
G 5 T again appears much larger than would
be expected for an individual quadruplex [G 5 T] 4 . We
speculate that this unusual behavior of G 5 T is a conse- 
quence of the uncapped 5'-end guanosine. This residue is 
likely part of an easily accessible G-tetrad at the 5'-end of 
a parallel quadruplex, allowing a second quadruplex to 
undergo favorable stacking through its own 5'-end. It is 
noteworthy that such a stacking is observed in the crystal 
structure of TG 4 T despite the presence of 'loose' thymine 
residues at either end of the quadruplex (32). 
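
The comparisons in this paragraph can be made more explicit with a short calculation. The sketch below rescales the two literature diffusion coefficients to 30°C using the Stokes-Einstein proportionality D ∝ T/η, and estimates the molecular weight of G 5 T assemblies from sequence. It rests on several assumptions: viscosities are textbook values for pure water (the samples contained salt and D 2 O), the quadruplexes are treated as temperature-independent rigid bodies, and the mass formula is the approximate one widely used by oligonucleotide calculators for unmodified DNA with a free 5'-hydroxyl, neglecting counterions. All function names are illustrative.

```python
# Viscosity of pure water in mPa·s at selected temperatures (°C).
# ASSUMPTION: buffer components and 10% D2O are neglected.
WATER_VISCOSITY_MPAS = {20: 1.002, 25: 0.890, 30: 0.797}


def rescale_diffusion(d, temp_from_c, temp_to_c=30):
    """Rescale a diffusion coefficient between temperatures via D ~ T / eta."""
    t_from = temp_from_c + 273.15
    t_to = temp_to_c + 273.15
    eta_from = WATER_VISCOSITY_MPAS[temp_from_c]
    eta_to = WATER_VISCOSITY_MPAS[temp_to_c]
    return d * (t_to / t_from) * (eta_from / eta_to)


# Approximate residue masses (Da) for unmodified DNA; the constant corrects
# for the missing 5'-phosphate. Counterions are not included.
RESIDUE_MASS_DA = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_CORRECTION_DA = 61.96


def oligo_mass_da(sequence):
    """Approximate molecular weight (Da) of a single DNA strand."""
    return sum(RESIDUE_MASS_DA[base] for base in sequence) - TERMINAL_CORRECTION_DA


if __name__ == "__main__":
    # Literature values quoted in the text, in units of 10^-7 cm^2/s.
    print("[T2G4T2]4 at 30 °C:", round(rescale_diffusion(13.4, 20), 1))
    print("[G4T4G4]2 at 30 °C:", round(rescale_diffusion(14.2, 25), 1))
    print("G5T measured at 30 °C: 12.4")

    strand = oligo_mass_da("GGGGGT")
    print("G5T strand (Da):", round(strand, 1))
    print("[G5T]4 (kDa):", round(4 * strand / 1000, 1))
    print("[[G5T]4]2 (kDa):", round(8 * strand / 1000, 1))
```

Under these assumptions both reference quadruplexes would be expected to diffuse considerably faster at 30°C than G 5 T does, which is in line with the argument that G 5 T forms an assembly larger than a single [G 5 T] 4 quadruplex.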

NMR binding studies were based on 1 H, 15 N-HSQC
spectra of 3C pro in the presence of G 5 T. This allowed 
identification of amino acids that are directly or indirectly 
affected by binding of the hexanucleotide (Figure 7B). Not 
all of the amino acids that display chemical shift changes 
are necessarily in direct contact with G 5 T. Some of the 
effects may be due to conformational changes of 3C pro 
upon binding to G 5 T. For example, the changes in the 
N-terminal β-barrel could be due to disturbances of a
hydrogen bonding network connecting both barrels. 
Such conformational changes could also be the reason 
for a decreased proteolytic activity in the presence of 
G 5 T. According to the 1 H, 15 N-HSQC spectra, the proteolytic
cleft itself is not affected by binding of G 5 T. This
suggests that the mechanism of inhibition is 
non-competitive. More recently, a competitive allosteric 
mechanism has been described (39) that would also be in 
accordance with G 5 T binding to a site distant from 
the proteolytic cleft. Our attempts to discriminate 
between these two alternatives by kinetic analysis (cf. 
Supplementary Figure S1) were not successful. Also, it
has to be taken into consideration that chemical shift 
changes could be followed only for a subset of amide 
cross peaks, leaving the possibility that some shift 
changes closer to the proteolytic cleft were overlooked. 
Crystallization attempts are underway to further elucidate 
the mechanism of inhibition. 

It is instructive to compare our results with two re- 
cent studies which have addressed the binding of 
viral RNA to 3C pro of poliovirus and rhinovirus (21,22). 
Figure 8. Inhibition of HAV replication by G 5 T. A luciferase-based monitor system for replication of HAV (pHAV-Luc) shows 60% inhibition of
the luciferase expression by G 5 T (bar #3) in comparison to a control hexanucleotide (bar #2) and control plasmid (bars #4-6). Data were compared
using a paired t test and the program Prism 4 (GraphPad Software) (n = 3). Indicated are mean values ± SEM; **P < 0.01.



In poliovirus 3C pro , the conserved KFRDI motif, along
with the N-terminal α-helix, exhibited chemical shift
changes when titrated with viral RNA. Similarly, this 
motif was also found to be affected in an NMR titration 
of rhinovirus 3C pro with the untranslated 5'-end of the 
viral RNA. These studies provide further evidence that 
the conserved KFRDI motif, which has previously been 
shown to be involved in the binding of the viral RNA, is 
indeed of central importance for the role of HAV 3C pro in 
genome replication. 

In HAV 3C pro , residues 95-99 contain the KFRDI 
motif, located (as in poliovirus 3C) in the interdomain 
connection loop between the two β-barrel domains.
In this study, we found this stretch in HAV 3C pro not
affected by the addition of G 5 T in 1 H, 15 N-HSQC
spectra. This observation was further substantiated 
by gel shift assays in which addition of G 5 T to the 
viral RNA/3C pro complex caused a super-shift, indicating 
that G 5 T does not compete with viral RNA for binding 
to 3C pro . 

We found that interaction of 3C pro and G 5 T strongly 
depends on ionic strength, suggesting that most molecular 
contacts are mediated by ionic side chains of the protein 
and the anionic backbone of the G-quadruplex. To render 
the complex less sensitive to physiological salt concentra- 
tions, it should be possible to 'supplement' the complex 
with additional hydrophobic interactions, for instance, by 
the addition of thymine loops to the oligonucleotide. 
Because structural prediction of G-quadruplex conform- 
ations is not reliable to date, a systematic array-based 
screening with marginally longer G-rich sequences and 
higher thymine content could be a step toward the 
design of a tighter binding ligand starting from G 5 T. It 
is noteworthy that, despite some structural ambiguity, the 
complex between the thrombin binding aptamer (TBA) 
and thrombin can also be characterized as mediated by 
a number of ionic interactions with the phosphate 
backbone, supplemented by aliphatic interactions with 
single thymine and guanine residues that reach out of 
the TBA helical core (40). 



Regarding the biology and relevance of the interaction 
between G 5 T and HAV 3C pro , it is noteworthy that among 
the 3C pro -binding hexanucleotides identified in this study 
two sequences are present within HAV 5'-UTR, 5'-CCGGAG-3'
and 5'-AGGCTA-3' (Figure 1). In the context of
the viral 5'-UTR RNA, these sequence segments are both 
thought to be involved in intramolecular duplex formation 
(41), and therefore would not be available for interactions
with 3C pro in the same way as they are as
single-stranded oligonucleotides. This is consistent with
the observed independent binding of in vitro transcribed 
HAV 5'-UTR RNA and G 5 T; viral RNA and G 5 T recog- 
nize different sites of 3C pro . Both the primary sequence of 
G 5 T and its propensity to form G-quartets are important 
for site-specific binding and interference with the enzym- 
atic activity of 3C pro . While other hexanucleotides were 
also able to bind HAV 3C pro (Figure 1), these interactions 
were not classified as functional with respect to inhibition 
of the proteolytic activity. The species CCGGAG (group 
I, pos. 25-30 of 5'-UTR RNA; Figure 1) and AGGCTA 
(group III, pos. 91-96 of 5'-UTR RNA) have been 
included in gel shift assays (Figure 2) where they did not 
provide any evidence for the formation of higher order 
complexes such as G-quartets. 1 H NMR spectra of these
compounds also showed all characteristics of monomeric
single-stranded oligonucleotides (shown for CCGGAG in 
Figure 5B). 

To test whether G 5 T could suppress viral replication, we 
studied the influence of G 5 T on viral functions in cell 
culture-based assays using a luciferase-based monitor 
system (42). An HAV control region-driven and 3C pro - 
dependent luciferase expression plasmid (pHAV-Fluc) 
was transfected in permissive Huh-T7 cells in the 
presence of G 5 T or other hexanucleotides. For control
purposes, we also replaced pHAV-Fluc by the standard
luciferase plasmid pGL3, which lacks HAV sequences. A significant
suppression of HAV-controlled gene expression was
observed with G 5 T but not when other hexanucleotides were
used instead (Figure 8). This is in line with preliminary
experiments indicating that in the presence of G 5 T 
but not in the presence of other hexamers the formation of
infectious recombinant HAV particles in HAV-infected 
Huh-T7 cells was significantly decreased (data not 
shown). These experiments strongly indicate that G 5 T 
acts as an inhibitor of HAV-specific gene expression, pre- 
sumably via its inhibitory effects on 3C pro as suggested by 
the in vitro studies described above. Based on these 
findings, G 5 T might be considered as a starting point for 
the development of pharmaceutically relevant anti-HAV 
compounds. 

Two recent studies reported the use of small oligo- 
nucleotides as inhibitors of picornaviral and hepatitis 
C virus (HCV) internal ribosomal entry sites (IRES)- 
mediated translation (43,44). In one of these studies (43), 
poly(rC) binding proteins 1 and 2 (PCBP1 and 2) 
were found to interact with a short C-rich oligonucleotide 
(CCCCCTT), thereby inhibiting picornaviral IRES- 
dependent translation. Two as yet uncharacterized cellular
proteins which contribute to IRES-mediated translation
in HCV were identified in the second study (44). Both
studies demonstrate how short DNA oligonucleotides 
can be employed to identify non-canonical cellular 
factors which are part of the viral translation machinery. 
While we cannot prove that such cellular factors do not 
exist for HAV 3C pro , on the basis of the extensive charac- 
terization of this interaction in vitro we think that it is 
likely that the inhibition observed in this study is due to 
a direct interaction between G 5 T and 3C pro . 

CONCLUSION 

We employed a range of biochemical and biophysical 
techniques to thoroughly characterize the interaction 
between HAV 3C pro and a small synthetic oligonucleotide. 
Our results, together with previous work in this direction
(24), support the hypothesis that small nucleotide fragments
such as the DNA hexamers studied here have
potential as specific ligands that interfere with protein
function. We suggest that further modification of such 
ligands may lead to novel inhibitors of virus replication 
in vivo. 